importance estimator
Selective Social-Interaction via Individual Importance for Fast Human Trajectory Prediction
Urano, Yota, Taketsugu, Hiromu, Ukita, Norimichi
This paper presents an architecture that selects important neighboring people for predicting the primary person's trajectory. To select neighbors effectively, we propose a selection module, the Importance Estimator, which outputs the importance of each neighboring person for predicting the primary person's future trajectory. Because sampling neighbors according to their importance is non-differentiable and would block gradients, we employ the Gumbel-Softmax during training. Experiments on the JRDB dataset show that our method speeds up prediction while maintaining competitive accuracy.
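To illustrate the idea, below is a minimal PyTorch sketch of importance-based neighbor selection using the Gumbel-Softmax straight-through trick. The module structure, feature dimensions, and MLP scorer are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ImportanceEstimator(nn.Module):
    """Hypothetical sketch: scores each neighbor's importance for the primary agent."""
    def __init__(self, feat_dim: int, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(feat_dim * 2, hidden), nn.ReLU(), nn.Linear(hidden, 2)
        )

    def forward(self, primary: torch.Tensor, neighbors: torch.Tensor) -> torch.Tensor:
        # primary: (B, D); neighbors: (B, N, D)
        B, N, D = neighbors.shape
        pair = torch.cat([primary.unsqueeze(1).expand(B, N, D), neighbors], dim=-1)
        logits = self.mlp(pair)                      # (B, N, 2): keep / drop logits per neighbor
        # Gumbel-Softmax keeps the selection differentiable; hard=True yields a
        # discrete keep/drop decision in the forward pass with straight-through gradients.
        mask = F.gumbel_softmax(logits, tau=1.0, hard=True, dim=-1)[..., 0]  # (B, N)
        return neighbors * mask.unsqueeze(-1)        # only selected neighbors contribute
```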
A Benchmark for Interpretability Methods in Deep Neural Networks
Reviews:
Summary: This paper proposes to evaluate saliency/importance visual explanations by removing "important" pixels and measuring whether a re-trained classifier can still classify such images correctly. Many explanation methods fail to remove this class-relevant information, but some ensembling techniques succeed by removing objects completely; these are deemed better explanations. The paper takes the view that important information is the information a classifier can use to predict the correct label. As a result, we can measure whether an importance estimate is good by measuring how much performance drops when the important pixels are removed from all images in both the train and validation sets.
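A minimal sketch of the remove-and-retrain protocol the summary describes, assuming user-supplied `train_fn` and `eval_fn` helpers and mean-value pixel replacement; the removal fractions are illustrative, not the benchmark's exact settings.

```python
import numpy as np

def degrade(images: np.ndarray, importance: np.ndarray, fraction: float) -> np.ndarray:
    """Replace the top `fraction` most-important pixels of each image with the image mean."""
    out = images.copy()
    k = int(fraction * importance[0].size)
    for img, imp in zip(out, importance):
        idx = np.argsort(imp.ravel())[::-1][:k]   # indices of the most important pixels
        img.flat[idx] = img.mean()                 # uninformative replacement value
    return out

def roar_curve(train_fn, eval_fn, train_x, train_y, test_x, test_y,
               importance_train, importance_test,
               fractions=(0.1, 0.3, 0.5, 0.7, 0.9)):
    """For each removal fraction, retrain on degraded data and record test accuracy.
    `train_fn(x, y)` returns a model; `eval_fn(model, x, y)` returns accuracy."""
    accs = []
    for f in fractions:
        model = train_fn(degrade(train_x, importance_train, f), train_y)
        accs.append(eval_fn(model, degrade(test_x, importance_test, f), test_y))
    return accs   # a steep drop indicates the estimator found genuinely informative pixels
```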
Feature Perturbation Augmentation for Reliable Evaluation of Importance Estimators in Neural Networks
Brocki, Lennart, Chung, Neo Christopher
Post-hoc explanation methods attempt to make the inner workings of deep neural networks more interpretable. However, since a ground truth is generally lacking, local post-hoc interpretability methods, which assign importance scores to input features, are challenging to evaluate. One of the most popular evaluation frameworks is to perturb features deemed important by an interpretability method and to measure the change in prediction accuracy. Intuitively, a large decrease in prediction accuracy would indicate that the explanation has correctly quantified the importance of features with respect to the prediction outcome (e.g., logits). However, the change in the prediction outcome may stem from perturbation artifacts, since perturbed samples in the test dataset are out of distribution (OOD) compared to the training dataset and can therefore disturb the model in unexpected ways. To overcome this challenge, we propose feature perturbation augmentation (FPA), which creates and adds perturbed images during model training. Through extensive computational experiments, we demonstrate that FPA makes deep neural networks (DNNs) more robust against perturbations. Furthermore, training DNNs with FPA demonstrates that the sign of importance scores may explain the model more meaningfully than previously assumed. Overall, FPA is an intuitive data augmentation technique that improves the evaluation of post-hoc interpretability methods.
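A minimal sketch of how perturbation augmentation along these lines could be folded into a training loop; the random masking scheme, mean-value replacement, and the `fpa_batch` helper are assumptions for illustration, not the paper's exact procedure.

```python
import torch

def fpa_batch(images: torch.Tensor, max_fraction: float = 0.5) -> torch.Tensor:
    """Return a copy of the batch in which a random fraction of pixels per image is
    replaced by the batch mean, imitating the perturbation applied at evaluation time."""
    perturbed = images.clone()
    for i in range(images.shape[0]):
        frac = torch.rand(1).item() * max_fraction            # random perturbation strength
        mask = torch.rand_like(images[i, :1]) < frac           # shared mask across channels
        perturbed[i] = torch.where(mask, images.mean(), images[i])
    return perturbed

# Assumed usage inside a standard training loop: train on clean and perturbed views together.
# for x, y in loader:
#     x_aug = torch.cat([x, fpa_batch(x)], dim=0)
#     y_aug = torch.cat([y, y], dim=0)
#     loss = criterion(model(x_aug), y_aug)
```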
Fidelity of Interpretability Methods and Perturbation Artifacts in Neural Networks
Brocki, Lennart, Chung, Neo Christopher
Despite excellent performance of deep neural networks (DNNs) in image classification, detection, and prediction, characterizing how DNNs make a given decision remains an open problem, resulting in a number of interpretability methods. Post-hoc interpretability methods primarily aim to quantify the importance of input features with respect to the class probabilities. However, due to the lack of ground truth and the existence of interpretability methods with diverse operating characteristics, evaluating these methods is a crucial challenge. A popular approach to evaluating interpretability methods is to perturb input features deemed important for a given prediction and observe the decrease in accuracy. However, perturbation itself may introduce artifacts. We propose a method for estimating the impact of such artifacts on fidelity estimation by utilizing model accuracy curves obtained from perturbing input features according to the Most Important First (MIF) and Least Important First (LIF) orders. Using a ResNet-50 trained on ImageNet, we demonstrate the proposed fidelity estimation on four popular post-hoc interpretability methods.
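A minimal sketch of comparing MIF and LIF accuracy curves as a fidelity proxy, assuming a user-supplied accuracy evaluator and mean-value perturbation; using the area between the two curves as a summary statistic is an illustrative choice, not necessarily the paper's exact estimator.

```python
import numpy as np

def accuracy_curve(model_acc_fn, images, importance, fractions, most_first=True):
    """Perturb pixels in MIF or LIF order and record model accuracy at each fraction.
    `model_acc_fn(perturbed_images)` is an assumed user-supplied accuracy evaluator."""
    accs = []
    for f in fractions:
        perturbed = images.copy()
        for img, imp in zip(perturbed, importance):
            order = np.argsort(imp.ravel())
            if most_first:
                order = order[::-1]                 # Most Important First
            idx = order[: int(f * order.size)]
            img.flat[idx] = img.mean()              # simple mean-value perturbation
        accs.append(model_acc_fn(perturbed))
    return np.array(accs)

def fidelity_gap(model_acc_fn, images, importance, fractions=np.linspace(0.0, 0.5, 6)):
    """Area between the LIF and MIF curves: a larger gap means the importance scores
    better separate informative from uninformative pixels; perturbation artifacts
    affect both curves similarly and thus partially cancel out."""
    mif = accuracy_curve(model_acc_fn, images, importance, fractions, most_first=True)
    lif = accuracy_curve(model_acc_fn, images, importance, fractions, most_first=False)
    return np.trapz(lif - mif, fractions)
```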
Evaluation of importance estimators in deep learning classifiers for Computed Tomography
Brocki, Lennart, Marchadour, Wistan, Maison, Jonas, Badic, Bogdan, Papadimitroulas, Panagiotis, Hatt, Mathieu, Vermet, Franck, Chung, Neo Christopher
Deep learning has shown superb performance in detecting objects and classifying images, holding great promise for analyzing medical imaging. Translating the success of deep learning to medical imaging, in which doctors need to understand the underlying process, requires the capability to interpret and explain the predictions of neural networks. Interpretability of deep neural networks often relies on estimating the importance of input features (e.g., pixels) with respect to the outcome (e.g., class probability). However, a number of importance estimators (also known as saliency maps) have been developed, and it is unclear which ones are more relevant for medical imaging applications. In the present work, we investigated the performance of several importance estimators in explaining the classification of computed tomography (CT) images by a convolutional deep network, using three distinct evaluation metrics. First, the model-centric fidelity measures a decrease in the model accuracy when certain inputs are perturbed. Second, concordance between importance scores and expert-defined segmentation masks is measured at the pixel level by receiver operating characteristic (ROC) curves. Third, we measure the region-wise overlap between an XRAI-based map and the segmentation mask by the Dice Similarity Coefficient (DSC). Overall, two versions of SmoothGrad topped the fidelity and ROC rankings, whereas both Integrated Gradients and SmoothGrad excelled in the DSC evaluation. Interestingly, there was a critical discrepancy between the model-centric (fidelity) and human-centric (ROC and DSC) evaluations. Expert expectation and intuition embedded in segmentation masks do not necessarily align with how the model arrived at its prediction. Understanding this difference in interpretability would help harness the power of deep learning in medicine.
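A minimal sketch of the pixel-level ROC and region-wise Dice comparisons against an expert segmentation mask; the top-fraction thresholding rule used to binarize the importance map is an assumption made here for illustration.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def pixel_roc_auc(importance: np.ndarray, mask: np.ndarray) -> float:
    """Pixel-level agreement: treat the expert mask as labels and importance as scores."""
    return roc_auc_score(mask.ravel().astype(int), importance.ravel())

def dice_coefficient(importance: np.ndarray, mask: np.ndarray, top_fraction: float = 0.1) -> float:
    """Region-wise overlap: binarize the map by keeping its top `top_fraction` pixels
    (an assumed thresholding rule) and compute the Dice Similarity Coefficient."""
    k = int(top_fraction * importance.size)
    thresh = np.partition(importance.ravel(), -k)[-k]
    pred = importance >= thresh
    gt = mask.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    return 2.0 * inter / (pred.sum() + gt.sum() + 1e-8)
```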